 task-specific skill


Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation

Liu, Sicong, Shu, Yang, Guo, Chenjuan, Yang, Bin

arXiv.org Artificial Intelligence

Learning a cooperative multi-agent policy from offline multi-task data that can generalize to unseen tasks with varying numbers of agents and targets is an attractive problem in many scenarios. Although aggregating general behavior patterns across multiple tasks as skills is a promising way to improve policy transfer, two primary challenges hinder the further advancement of skill learning in offline multi-task MARL. First, extracting general cooperative behaviors from diverse action sequences as common skills fails to incorporate cooperative temporal knowledge into them. Second, existing works involve only common skills and cannot adaptively select independent knowledge as task-specific skills for fine-grained action execution in each task. To tackle these challenges, we propose Hierarchical and Separate Skill Discovery (HiSSD), a novel approach for generalizable offline multi-task MARL through skill learning. HiSSD leverages a hierarchical framework that jointly learns common and task-specific skills. The common skills capture cooperative temporal knowledge and enable in-sample exploitation for offline multi-task MARL. The task-specific skills represent the priors of each task and achieve task-guided fine-grained action execution. To evaluate our method, we conduct experiments on the multi-agent MuJoCo and SMAC benchmarks. After training the policy with HiSSD on offline multi-task data, the empirical results show that HiSSD assigns effective cooperative behaviors and achieves superior performance on unseen tasks.
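The hierarchical split the abstract describes can be illustrated with a minimal NumPy sketch: a high-level planner selects among shared common-skill embeddings, a per-task prior supplies the task-specific skill, and a low-level controller conditions the action on both. All dimensions, weights, and names here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

obs_dim, skill_dim, act_dim = 8, 4, 2   # hypothetical sizes
n_common_skills = 5

# High-level planner: scores the shared common skills from an observation.
W_plan = rng.normal(size=(n_common_skills, obs_dim))
common_skills = rng.normal(size=(n_common_skills, skill_dim))  # stand-in skill embeddings

# Task-specific prior: one embedding per task (learned offline in the paper).
task_priors = {"task_a": rng.normal(size=skill_dim)}

# Low-level controller: maps (obs, common skill, task prior) to a bounded action.
W_act = rng.normal(size=(act_dim, obs_dim + 2 * skill_dim))

def act(obs, task):
    """Hierarchical action: pick a common skill, condition on the task prior."""
    scores = W_plan @ obs
    z_common = common_skills[int(np.argmax(scores))]  # selected common skill
    z_task = task_priors[task]                        # task-specific skill
    h = np.concatenate([obs, z_common, z_task])
    return np.tanh(W_act @ h)                         # fine-grained action

a = act(rng.normal(size=obs_dim), "task_a")
```

Generalization to an unseen task would then amount to inferring a new entry for `task_priors` while reusing the shared `common_skills`.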


Customizable Combination of Parameter-Efficient Modules for Multi-Task Learning

Wang, Haowen, Sun, Tao, Fan, Cong, Gu, Jinjie

arXiv.org Artificial Intelligence

Modular and composable transfer learning is an emerging direction in the field of Parameter-Efficient Fine-Tuning, as it enables neural networks to better organize various aspects of knowledge, leading to improved cross-task generalization. In this paper, we introduce a novel approach, Customized Polytropon (C-Poly), that combines task-common skills and task-specific skills, while parameterizing the skills with low-rank techniques. Each task is associated with a customizable number of exclusive specialized skills and also benefits from skills shared with peer tasks. A skill assignment matrix is jointly learned. To evaluate our approach, we conducted extensive experiments on the Super-NaturalInstructions and the SuperGLUE benchmarks. Our findings demonstrate that C-Poly outperforms fully-shared, task-specific, and skill-indistinguishable baselines, significantly enhancing the sample efficiency in multi-task learning scenarios. As the number of parameters in Large Language Models (LLMs) continues to grow, training these models efficiently with limited computational resources has become a challenge. In recent years, there has been a shift towards employing Parameter-Efficient Fine-Tuning (PEFT) methods to address this issue. Examples of such methods include LoRA (Hu et al., 2022), AdaLoRA (Zhang et al., 2023a), and (IA)³. These methods focus on fine-tuning the adapter while freezing the pre-trained model, effectively reducing the computational cost.
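The combination the abstract describes can be sketched in a few lines of NumPy: each skill is a LoRA-style low-rank adapter, each task mixes the shared skills through a row of a learned skill-assignment matrix and adds its own exclusive skills on top of a frozen weight. All sizes, the random stand-in parameters, and the helper names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, rank = 16, 16, 4            # hypothetical layer and low-rank sizes
n_tasks, n_common, n_specific = 3, 2, 1  # task-common vs. exclusive skill counts

def make_skill():
    """One skill as a low-rank adapter: delta_W = B @ A (LoRA-style)."""
    A = rng.normal(scale=0.01, size=(rank, d_in))
    B = rng.normal(scale=0.01, size=(d_out, rank))
    return A, B

common_skills = [make_skill() for _ in range(n_common)]
# Each task also owns a customizable number of exclusive specialized skills.
specific_skills = [[make_skill() for _ in range(n_specific)] for _ in range(n_tasks)]

# Skill-assignment matrix over the shared skills (one row per task);
# random logits stand in for the jointly learned parameters.
logits = rng.normal(size=(n_tasks, n_common))
assign = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

def task_delta(task_id):
    """Compose a task's weight update from shared and exclusive skills."""
    delta = np.zeros((d_out, d_in))
    for w, (A, B) in zip(assign[task_id], common_skills):
        delta += w * (B @ A)             # weighted shared skill
    for A, B in specific_skills[task_id]:
        delta += B @ A                   # exclusive specialized skill
    return delta

W = rng.normal(size=(d_out, d_in))       # frozen pre-trained weight
x = rng.normal(size=d_in)
y = (W + task_delta(0)) @ x              # forward pass for task 0
```

Because only the `A`/`B` factors and the assignment logits would be trained, the per-task parameter count stays small relative to the frozen `W`, which is the point of the low-rank parameterization.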


Why video games and board games aren't a good measure of AI intelligence

#artificialintelligence

Measuring the intelligence of AI is one of the trickiest but most important questions in the field of computer science. If you can't understand whether the machine you've built is cleverer today than it was yesterday, how do you know you're making progress? At first glance, this might seem like a non-issue. "Obviously AI is getting smarter" is one reply. "Just look at all the money and talent pouring into the field. Look at the milestones, like beating humans at Go, and the applications that were impossible to solve a decade ago that are commonplace today, like image recognition. How is that not progress?"